Clusters Beat Trend!? Testing Feature Hierarchy in Statistical Graphics
Susan VanderPlas
Iowa State University
Graphics and Perception
The greatest value of a picture is when it forces us to notice what we never expected to see.
John Tukey
Gestalt Laws of Perception

The whole is different than the sum of the parts
- Rules that make sense of complex visual information using experience
- Information organized hierarchically
- Subconscious process to order and group visual input
Gestalt Plots

How do plot aesthetics (color, shape, trend lines, error bands) change our perception of the plotted data?
Statistical Lineups
Which plot is the most different?
Null plot data is from a data-generating method consistent with the null hypothesis
The nullabor package helps with null data creation
Two-Target Lineups
- Modify lineup protocol for tests of competing hypotheses \(H_1\) and \(H_2\)
- \(H_1\) and \(H_2\) target plots
- 18 null plots generated using a mixture model consistent with \(H_0\)
5, 12
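A two-target lineup can be assembled roughly as follows. This is a minimal sketch, not the study's code: `assemble_lineup` and its arguments are hypothetical names, and a panel can be any object, such as a data frame or a list of points.

```python
import random

def assemble_lineup(trend_target, cluster_target, null_plots, rng=None):
    """Shuffle two target panels in among the null panels.

    Returns the ordered panels and the 1-based positions of the two targets.
    """
    rng = rng or random.Random()
    tagged = [("trend", trend_target), ("cluster", cluster_target)]
    tagged += [("null", panel) for panel in null_plots]
    rng.shuffle(tagged)  # random placement hides the targets
    positions = {kind: i + 1 for i, (kind, _) in enumerate(tagged) if kind != "null"}
    return [panel for _, panel in tagged], positions

# 18 placeholder null panels plus two targets gives a 20-panel lineup
nulls = [f"null-{i}" for i in range(18)]
panels, positions = assemble_lineup("trend-data", "cluster-data", nulls, random.Random(7))
print(len(panels), sorted(positions))
```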
Data Generating Mechanism
- Generate data from a linear model \(M_T\) (trend)
- Generate data from a \(k\) cluster model \(M_C\)
- Generate null data from a mixture model \(M_0\) with \(n_c\) observations from \(M_C\) and \(n_t = N - n_c\) observations from \(M_T\)
Linear Model
Parameter: \(\sigma_T\), the amount of variability around the trend line
- Generate evenly spaced \(x_i\) in \([-1, 1]\)
- Jitter \(x_i\)
- Generate \(y_i = x_i + e_i\), \(e_i \sim N(0, \sigma_T^2)\)
- Center and scale \(x_i, y_i\)
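The four steps above could be sketched in Python as below. The slide does not specify the jitter distribution, so the uniform jitter within half a grid step is an assumption, and the function names are hypothetical.

```python
import random
import statistics

def standardize(values):
    """Center to mean 0 and scale to unit sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def generate_trend(n, sigma_t, rng=None):
    """Sketch of M_T: jittered, evenly spaced x in [-1, 1] and y = x + e."""
    rng = rng or random.Random()
    step = 2 / (n - 1)
    # Assumption: jitter is uniform within half a grid step
    x = [-1 + i * step + rng.uniform(-step / 2, step / 2) for i in range(n)]
    # e_i ~ N(0, sigma_T^2); random.gauss takes the standard deviation
    y = [xi + rng.gauss(0, sigma_t) for xi in x]
    return standardize(x), standardize(y)

# N = 15 K with K = 3 gives 45 points
x, y = generate_trend(n=45, sigma_t=0.25, rng=random.Random(1))
```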
Cluster Model
Parameters: \(K\) clusters, \(\sigma_C\) cluster variability
- Generate \(K\) cluster centers \(c^x,c^y\) on a \(K\times K\) grid such that \(cor(c^x, c^y) \in [.25, .75]\)
- Center and standardize \(c^x, c^y\)
- Determine cluster size \(g_1, ..., g_K \sim Multinomial(K, p)\)
- Generate points around cluster centers: \((x_i, y_i) = (c^x_{g_i}, c^y_{g_i}) + (e_i^x, e_i^y)\) where \(e_i^x, e_i^y \sim N(0, \sigma_C^2)\)
- Center and scale \(x_i, y_i\)
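One way to read the cluster steps is sketched below, under several labeled assumptions: centers sit at distinct rows and columns of the grid (a permutation of \(1, ..., K\)) and are redrawn until the correlation falls in \([.25, .75]\); each of the \(N\) points is assigned to a cluster with equal probabilities \(p\); and all function names are hypothetical.

```python
import random
import statistics

def standardize(values):
    """Center to mean 0 and scale to unit sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    num = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    den = (sum((u - ma) ** 2 for u in a) * sum((v - mb) ** 2 for v in b)) ** 0.5
    return num / den

def cluster_centers(k, rng):
    """Assumption: one center per grid row and column, redrawn until
    cor(c^x, c^y) lies in [.25, .75]. Needs k >= 3 to be feasible."""
    if k < 3:
        raise ValueError("k must be at least 3")
    cx = list(range(1, k + 1))
    while True:
        cy = rng.sample(cx, k)  # random permutation of the grid rows
        if 0.25 <= pearson(cx, cy) <= 0.75:
            return standardize(cx), standardize(cy)

def generate_clusters(n, k, sigma_c, rng=None):
    """Sketch of M_C: n points scattered around k standardized centers."""
    rng = rng or random.Random()
    cx, cy = cluster_centers(k, rng)
    # Assumption: equal cluster probabilities p = 1/k for every point
    labels = rng.choices(range(k), k=n)
    x = [cx[g] + rng.gauss(0, sigma_c) for g in labels]
    y = [cy[g] + rng.gauss(0, sigma_c) for g in labels]
    return standardize(x), standardize(y), labels

x, y, labels = generate_clusters(n=45, k=3, sigma_c=0.25, rng=random.Random(2))
```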
Mixture Model
\(n_c\) points from \(M_C\) and \(n_t = N - n_c\) points from \(M_T\), where \(n_c \sim Binomial(N, \lambda)\)
Groups created by k-means clustering
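The mixing step might look like the sketch below. It assumes that a full sample of \(N\) points from each model is already available and that the null set is formed by subsampling from both; `draw_null` is a hypothetical name, and the k-means grouping is not shown.

```python
import random

def draw_null(cluster_points, trend_points, lam=0.5, rng=None):
    """Sketch of M_0: n_c ~ Binomial(N, lambda) points from the cluster
    sample and the remaining N - n_c points from the trend sample."""
    rng = rng or random.Random()
    n = len(trend_points)
    if len(cluster_points) != n:
        raise ValueError("both samples must contain N points")
    # Binomial(N, lambda) as a sum of N Bernoulli(lambda) draws
    n_c = sum(rng.random() < lam for _ in range(n))
    sample = rng.sample(cluster_points, n_c) + rng.sample(trend_points, n - n_c)
    rng.shuffle(sample)
    return sample, n_c

# Toy inputs: the second coordinate marks which model a point came from
cluster_pts = [(float(i), 0.0) for i in range(45)]
trend_pts = [(float(i), 1.0) for i in range(45)]
null_points, n_c = draw_null(cluster_pts, trend_pts, lam=0.5, rng=random.Random(4))
```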
Experimental Design - Data Parameters
- \(K = 3, 5\)
- \(N = 15 K\)
- \(\sigma_T = 0.25, 0.35, 0.45\)
- \(\sigma_C = \begin{cases}0.25, 0.30, 0.35 & (K = 3)\\ 0.20, 0.25, 0.30 & (K = 5)\end{cases}\)
- \(\lambda = 0.5\)
18 combinations of data parameters (2 values of \(K\) \(\times\) 3 values of \(\sigma_T\) \(\times\) 3 values of \(\sigma_C\))
3 replicates of each parameter set = 54 total lineup data sets
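The counts can be checked by enumerating the grid. The sketch below uses the values from the slide; the variable names and dictionary keys are illustrative only.

```python
from itertools import product

sigma_t_values = [0.25, 0.35, 0.45]
# sigma_C levels depend on the number of clusters K
sigma_c_values = {3: [0.25, 0.30, 0.35], 5: [0.20, 0.25, 0.30]}
replicates = 3

parameter_sets = [
    {"K": k, "N": 15 * k, "sigma_T": st, "sigma_C": sc, "lambda": 0.5}
    for k in (3, 5)
    for st, sc in product(sigma_t_values, sigma_c_values[k])
]

# One lineup data set per parameter set and replicate
datasets = [dict(p, replicate=r) for p in parameter_sets for r in range(1, replicates + 1)]

print(len(parameter_sets), len(datasets))  # 18 54
```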
Experimental Design - Plot Aesthetics

10 Aesthetics \(\times\) 54 data sets = 540 plots
Experimental Design
- 1201 participants from Mechanical Turk
- Each participant evaluates 10 plots (12,010 evaluations)
- Each \(\sigma_C \times \sigma_T\) value with one replicate, randomized across \(K\) values
- All 10 aesthetic types
- Participants select the plot or plots which are most different
- Provide a short explanation
- Rate confidence level
Results

Most participants identified a mix of cluster and trend targets
Faceoff Model
- Examine trials in which participants identified at least one target (9959)
- Compare P(select cluster target) to P(select trend target)
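\(C_{ijk} := \{\text{participant } k \text{ selects the cluster target in plot } ij\}\), with \(T_{ijk}\) defined analogously for the trend target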
\[\text{logit} P(C_{ijk}|C_{ijk}\cup T_{ijk}) = \mathbf{W}\alpha + \mathbf{X}\beta + \mathbf{J}\gamma + \mathbf{K}\eta\]
- \(\alpha\): vector of fixed effects describing the effect of data parameters \(\sigma_C,\sigma_T, K\)
- \(\beta\): vector of fixed effects describing the effect of aesthetics
- \(\gamma_j\): random effect of dataset, \(\gamma_j\overset{iid}{\sim} N(0, \sigma^2_{\text{data}})\)
- \(\eta_k\): random effect of participant, \(\eta_k\overset{iid}{\sim} N(0, \sigma^2_{\text{participant}})\)
- \(\epsilon_{ijk}\): error associated with single evaluation of plot \(ij\) by participant \(k\), \(\epsilon_{ijk}\sim N(0, \sigma^2_e)\)
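To read the model on the probability scale, the linear predictor is passed through the inverse logit. The sketch below shows only that conversion. The effect values are made-up illustrations, not estimates from the study, and the function names are hypothetical.

```python
import math

def inv_logit(eta):
    """Map a log-odds value to a probability."""
    return 1 / (1 + math.exp(-eta))

def p_cluster_given_target(data_effect, aesthetic_effect, dataset_effect, participant_effect):
    """P(C | C or T) for one evaluation, given the four additive terms
    W*alpha, X*beta, gamma_j and eta_k on the logit scale."""
    eta = data_effect + aesthetic_effect + dataset_effect + participant_effect
    return inv_logit(eta)

# All terms zero: cluster and trend targets are equally likely to be chosen
p_even = p_cluster_given_target(0.0, 0.0, 0.0, 0.0)
# Illustrative values only: a positive aesthetic effect shifts choices toward the cluster target
p_shifted = p_cluster_given_target(0.0, 0.8, 0.1, -0.2)
```

A positive linear predictor means the cluster target is favored over the trend target among trials where at least one target was identified; a negative one favors the trend target.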